This note proves a loss bound for the exponentially weighted average
forecaster with time-varying potential; see [1, § 2.3] for context and
definitions. The present proof gives a better constant in the regret term
than Theorem 2.3 in [1]. This proof first appeared in [2] (Theorem 2), where
a more general algorithm is considered. Here the proof is rewritten using the
notation of [1].

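Theorem 1. Assume that the loss function $\ell$ is convex in the first
argument and $\ell(p, y) \in [0, 1]$ for all $p \in \mathcal{D}$ and
$y \in \mathcal{Y}$. For any positive reals $\eta_1 \ge \eta_2 \ge \ldots$,
for any $n \ge 1$ and for any $y_1, \ldots, y_n \in \mathcal{Y}$, the regret
of the exponentially weighted average forecaster with time-varying learning
rate $\eta_t$ satisfies

\[
  \widehat{L}_n - \min_{i=1,\ldots,N} L_{i,n}
  \le \frac{\ln N}{\eta_n} + \frac{1}{8} \sum_{t=1}^{n} \eta_t .
  \tag{1}
\]

In particular, for $\eta_t = \sqrt{(4 \ln N)/t}$, $t = 1, \ldots, n$, we have

\[
  \widehat{L}_n - \min_{i=1,\ldots,N} L_{i,n} \le \sqrt{n \ln N} .
\]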
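
The statement of Theorem 1 can be illustrated numerically. The following is a minimal sketch, not part of the original note: it assumes the weights $w_{i,t-1} = e^{-\eta_t L_{i,t-1}}$ of the forecaster from [1, § 2.3], uses the absolute loss $|p - y|$ on $[0, 1]$ as one example of a convex bounded loss, and feeds it randomly generated expert predictions and outcomes. The function name `ewa_regret` and all data are hypothetical.

```python
import math
import random


def ewa_regret(expert_preds, outcomes):
    """Run the exponentially weighted average forecaster with
    eta_t = sqrt(4 ln N / t) under absolute loss.

    Returns (regret, bound) where bound = sqrt(n ln N).
    """
    n = len(outcomes)
    N = len(expert_preds[0])
    cum_expert = [0.0] * N  # cumulative expert losses L_{i,t-1}
    cum_learner = 0.0       # cumulative forecaster loss
    for t in range(1, n + 1):
        eta = math.sqrt(4.0 * math.log(N) / t)
        # Weights w_{i,t-1} = exp(-eta_t * L_{i,t-1}). Subtracting the
        # smallest loss rescales all weights equally and avoids underflow.
        base = min(cum_expert)
        w = [math.exp(-eta * (L - base)) for L in cum_expert]
        f = expert_preds[t - 1]
        p = sum(wi * fi for wi, fi in zip(w, f)) / sum(w)
        y = outcomes[t - 1]
        cum_learner += abs(p - y)
        cum_expert = [L + abs(fi - y) for L, fi in zip(cum_expert, f)]
    regret = cum_learner - min(cum_expert)
    return regret, math.sqrt(n * math.log(N))


random.seed(0)
n, N = 500, 10
preds = [[random.random() for _ in range(N)] for _ in range(n)]
ys = [random.randint(0, 1) for _ in range(n)]
regret, bound = ewa_regret(preds, ys)
print(regret, bound)
```

On any input with losses in $[0, 1]$ the printed regret should not exceed the printed bound; random data typically gives a regret far below it, since the bound is a worst-case guarantee.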
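Proof. Using the Hoeffding inequality ([1, Lemma A.1]) and the convexity of
$\ell$ in the first argument, we get

\[
  \sum_{i=1}^{N} \frac{w_{i,t-1}}{W_{t-1}} e^{-\eta_t \ell(f_{i,t}, y_t)}
  \le e^{-\eta_t \sum_{i=1}^{N} \frac{w_{i,t-1}}{W_{t-1}} \ell(f_{i,t}, y_t)
         + \eta_t^2/8}
  \le e^{-\eta_t \ell(\widehat{p}_t, y_t) + \eta_t^2/8}
\]

and thus

\[
  \sum_{i=1}^{N} \frac{w_{i,t-1}}{W_{t-1}}
  e^{-\eta_t \ell(f_{i,t}, y_t) + \eta_t \ell(\widehat{p}_t, y_t)
     - \eta_t^2/8} \le 1 .
  \tag{2}
\]

Consider the values

\[
  s_{i,t-1} = e^{-\eta_{t-1} L_{i,t-1} + \eta_{t-1} \widehat{L}_{t-1}
                 - \frac{\eta_{t-1}}{8} \sum_{k=1}^{t-1} \eta_k}
\]

and note that

\[
  \frac{w_{i,t-1}}{W_{t-1}}
  = \frac{(s_{i,t-1})^{\eta_t/\eta_{t-1}}}
         {\sum_{j=1}^{N} (s_{j,t-1})^{\eta_t/\eta_{t-1}}} .
  \tag{3}
\]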
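Let us show that $\sum_{j=1}^{N} \frac{1}{N} s_{j,t} \le 1$ by induction over
$t$. For $t = 0$ this is trivial, since $s_{j,0} = 1$ for all $j$. Assume that
$\sum_{j=1}^{N} \frac{1}{N} s_{j,t-1} \le 1$. Then

\[
  \sum_{j=1}^{N} \frac{1}{N} (s_{j,t-1})^{\eta_t/\eta_{t-1}}
  \le \left( \sum_{j=1}^{N} \frac{1}{N} s_{j,t-1} \right)^{\eta_t/\eta_{t-1}}
  \le 1
  \tag{4}
\]

since the function $x \mapsto x^{\alpha}$ is concave and monotone for
$x \ge 0$ and $\alpha \in [0, 1]$ and since $\eta_{t-1} \ge \eta_t > 0$. Using
(4) to bound the right-hand side of (3), we get
$\frac{w_{i,t-1}}{W_{t-1}} \ge \frac{1}{N} (s_{i,t-1})^{\eta_t/\eta_{t-1}}$;
and combining with (2), we get

\[
  \sum_{i=1}^{N} \frac{1}{N} (s_{i,t-1})^{\eta_t/\eta_{t-1}}
  e^{-\eta_t \ell(f_{i,t}, y_t) + \eta_t \ell(\widehat{p}_t, y_t)
     - \eta_t^2/8} \le 1 .
\]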

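It remains to note that

\[
  s_{i,t} = (s_{i,t-1})^{\eta_t/\eta_{t-1}}
  e^{-\eta_t \ell(f_{i,t}, y_t) + \eta_t \ell(\widehat{p}_t, y_t)
     - \eta_t^2/8}
\]

and we get $\sum_{i=1}^{N} \frac{1}{N} s_{i,t} \le 1$.

For any $i$, we have
$\frac{1}{N} s_{i,n} \le \sum_{j=1}^{N} \frac{1}{N} s_{j,n} \le 1$, thus

\[
  -\eta_n L_{i,n} + \eta_n \widehat{L}_n
  - \frac{\eta_n}{8} \sum_{k=1}^{n} \eta_k \le \ln N ,
\]

and (1) follows. □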
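
The invariant at the heart of the proof, namely that the average of the values $s_{j,t} = e^{-\eta_t L_{j,t} + \eta_t \widehat{L}_t - \frac{\eta_t}{8} \sum_{k \le t} \eta_k}$ over the $N$ experts never exceeds 1, can also be checked numerically. The sketch below is an illustration only: the function name `max_average_s`, the absolute loss, and the random data are assumptions made for the example, and the forecaster weights are taken to be $w_{i,t-1} = e^{-\eta_t L_{i,t-1}}$ as in [1, § 2.3].

```python
import math
import random


def max_average_s(expert_preds, outcomes, etas):
    """Return the largest value over t = 1..n of (1/N) * sum_j s_{j,t}."""
    N = len(expert_preds[0])
    cum_expert = [0.0] * N  # L_{j,t}
    cum_learner = 0.0       # cumulative forecaster loss
    eta_sum = 0.0           # sum of eta_k for k <= t
    worst = 0.0
    for t, (f, y) in enumerate(zip(expert_preds, outcomes), start=1):
        eta = etas[t - 1]
        # Forecaster prediction with weights exp(-eta_t * L_{j,t-1}),
        # shifted by the smallest loss for numerical stability.
        base = min(cum_expert)
        w = [math.exp(-eta * (L - base)) for L in cum_expert]
        p = sum(wi * fi for wi, fi in zip(w, f)) / sum(w)
        cum_learner += abs(p - y)
        cum_expert = [L + abs(fi - y) for L, fi in zip(cum_expert, f)]
        eta_sum += eta
        # s_{j,t} evaluated directly from its definition.
        s = [math.exp(eta * (cum_learner - L) - eta * eta_sum / 8.0)
             for L in cum_expert]
        worst = max(worst, sum(s) / N)
    return worst


random.seed(1)
n, N = 300, 8
preds = [[random.random() for _ in range(N)] for _ in range(n)]
ys = [random.randint(0, 1) for _ in range(n)]
etas = [math.sqrt(4.0 * math.log(N) / t) for t in range(1, n + 1)]
worst = max_average_s(preds, ys, etas)
print(worst)
```

Any non-increasing positive sequence can be passed as `etas`; the proof requires only $\eta_{t-1} \ge \eta_t > 0$.
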
Theorem 1 recommends the learning rate $\eta_t = \sqrt{(4 \ln N)/t}$ instead
of $\sqrt{(8 \ln N)/t}$ used in Theorem 2.3 in [1] and achieves the regret
term $\sqrt{n \ln N}$ instead of $\sqrt{2n \ln N} + \sqrt{0.125 \ln N}$.
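
The regret term $\sqrt{n \ln N}$ can be verified by substituting $\eta_t = \sqrt{(4 \ln N)/t}$ into the bound $\frac{\ln N}{\eta_n} + \frac{1}{8} \sum_{t=1}^{n} \eta_t$ of Theorem 1 and using the standard estimate $\sum_{t=1}^{n} t^{-1/2} \le 2 \sqrt{n}$ (a step spelled out here for convenience; it is not taken from the original text):

\[
  \frac{\ln N}{\eta_n} = \frac{\sqrt{n \ln N}}{2} ,
  \qquad
  \frac{1}{8} \sum_{t=1}^{n} \eta_t
  = \frac{\sqrt{\ln N}}{4} \sum_{t=1}^{n} \frac{1}{\sqrt{t}}
  \le \frac{\sqrt{\ln N}}{4} \cdot 2 \sqrt{n}
  = \frac{\sqrt{n \ln N}}{2} .
\]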

To compare the bounds for arbitrary learning rates, let us observe that
the proof of Theorem 2.3 in [1] actually implies (under the assumptions of
Theorem 1):

\[
  \widehat{L}_n - \min_{i=1,\ldots,N} L_{i,n}
  \le \left( \frac{2}{\eta_n} - \frac{1}{\eta_1} \right) \ln N
      + \frac{1}{8} \sum_{t=1}^{n} \eta_t .
\]

The right-hand side of this inequality is larger than the right-hand side of
(1) if $\eta_n \ne \eta_1$. If $\eta_t$ are equal for all $t$, the bounds
coincide and give the bound of Theorem 2.2 in [1].
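
The comparison can be made concrete with a few lines of code. This is a sketch under the assumption that the two right-hand sides are $\frac{\ln N}{\eta_n} + \frac{1}{8} \sum_t \eta_t$ (Theorem 1) and $\left( \frac{2}{\eta_n} - \frac{1}{\eta_1} \right) \ln N + \frac{1}{8} \sum_t \eta_t$ (implied by the proof of Theorem 2.3 in [1]); the function names and the values of $N$ and $n$ are hypothetical.

```python
import math


def bound_theorem_1(etas, N):
    """Right-hand side of (1): ln N / eta_n + (1/8) * sum of eta_t."""
    return math.log(N) / etas[-1] + sum(etas) / 8.0


def bound_from_proof_of_thm_2_3(etas, N):
    """(2/eta_n - 1/eta_1) * ln N + (1/8) * sum of eta_t."""
    return (2.0 / etas[-1] - 1.0 / etas[0]) * math.log(N) + sum(etas) / 8.0


N, n = 10, 1000

# Decreasing learning rate eta_t = sqrt(4 ln N / t).
etas = [math.sqrt(4.0 * math.log(N) / t) for t in range(1, n + 1)]
new = bound_theorem_1(etas, N)
old = bound_from_proof_of_thm_2_3(etas, N)
gap = old - new  # equals (1/eta_n - 1/eta_1) * ln N

# Constant learning rate: the two bounds coincide.
const = [0.1] * n
new_const = bound_theorem_1(const, N)
old_const = bound_from_proof_of_thm_2_3(const, N)

print(new, old, gap)
print(new_const, old_const)
```

The gap between the two bounds is exactly $\left( \frac{1}{\eta_n} - \frac{1}{\eta_1} \right) \ln N$, which is nonnegative for non-increasing learning rates and zero when $\eta_n = \eta_1$.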